(54) Method and system for enrolling addresses in a speech recognition database



(57) A method and system for enrolling speed dial names
includes providing speaker dependent templates and associated
telephone numbers and providing a penalized garbage model for
unrecognized speech. When a request for a new template is
received, it is determined whether the list of speed dial names
is full (Step 201) and, if not, whether the name is too similar
(Step 205) to a name already on the speed dial list. If so, that
name is rejected; if not, it is determined whether the speed
dial name is too short (Step 302). If the name is not too short,
or if the user wants to enter the short name anyway, the system
asks the user to repeat the speed dial name and, if the
repetition matches, the name is entered. If it does not match,
the system swaps the first and second utterances and compares
again to see whether there is a match.
Description

TECHNICAL FIELD OF THE INVENTION 

This invention relates to speech recognition and 
more particularly to enrollment of speech recognition 
addresses in a speech recognition database. 

BACKGROUND OF THE INVENTION 



The enrollment of name addresses in a speech recognition
database is used in speed dialing. In speed dialing, for
example, a certain number or bank of telephone numbers is
pre-stored and the user only has to address that bank of numbers
by saying a name to have the telephone number called. It is
highly desirable that the user speed dial by speaking the
address by name into the telephone, so that the telephone number
associated with that name in the bank of telephone numbers is
dialed. It is therefore desirable to provide an improved method
and system for enrolling the speed dial name addresses into the
telephone system so that the correct number will be dialed when
a name is spoken into the telephone system.

SUMMARY OF THE INVENTION 

In accordance with one preferred embodiment of the present
invention, a method and system for enrolling addresses as names
in a speech recognition database is provided by providing a
penalized garbage model for unrecognized speech, receiving a new
utterance for enrollment from a user and generating a template
of the new utterance. A repeat of the utterance is then compared
to the template to determine if the new utterance template
should be entered into the database.

In accordance with another preferred embodiment of the present
invention, a method and system for enrolling names in a speech
recognition database includes a database with speaker dependent
templates and a penalized garbage model, and compares the name
to be enrolled to the names in the database to reject any name
that is too similar.

In accordance with another preferred embodiment of the present
invention, the method includes determining if the name to be
enrolled into the database is too short before entering it into
the database.



DESCRIPTION OF THE DRAWINGS 

The present invention will now be further described, by way of
example, with reference to the accompanying drawings in which:

Fig. 1 illustrates a simplified block diagram of a telephone
system that implements a method of the present invention;

Fig. 2 illustrates a flow diagram of a method for generating
multi-user spoken speed dial directories in the voice
recognition telephone system;

Fig. 3 illustrates a general flow diagram of enrolling and
deleting a directory name within the telephone system;

Fig. 4 illustrates a flow diagram of a method for enrolling and
modifying a speed dial list corresponding to a directory name in
the telephone system;

Fig. 5 is a flow chart of voice dial add entry according to one
embodiment of the present invention;

Fig. 6 illustrates a single garbage model;

Fig. 7 is a flow chart of voice dial add entry enroll in Fig. 5;

Fig. 8 is a flow chart of voice dial add entry update in Fig. 7;
and

Fig. 9 is a flow chart for voice dial add entry retry in Fig. 8.

DETAILED DESCRIPTION OF THE INVENTION 

FIGURE 1 is a simplified block diagram of a telephone system 10.
Telephone system 10 includes a telephone 11 that connects to a
processor 12. An off-hook detect circuit 13 and a recognition
and record circuit 14 connect to telephone 11 and processor 12.
Processor 12 also connects to a memory 15. In operation,
off-hook detect circuit 13 informs processor 12 that telephone
11 indicates an off-hook condition and allows processor 12 to
monitor commands according to a program stored within and
executed by processor 12. The program within processor 12 allows
a user to generate a directory name address and a speed dial
list of entry names and corresponding phone numbers associated
with the directory name address. Telephone system 10 stores
speaker dependent templates of the directory name address and
associated entry names and phone numbers such that each user can
access only that user's specific directory name and speed dial
list.

FIGURE 2 is an initial flow diagram of a method for generating
multi-user spoken speed dial directories in voice recognition
telephone system 10. The processor 12 in one embodiment is
programmed according to this flow diagram. Off-hook detect
circuit 13 of telephone system 10 monitors telephone 11 at step
16 to detect an off-hook condition on the specific telephone.
Once detection of an off-hook condition occurs, processor 12
prompts a user to input a command at step 17. At step 18,
processor 12, in conjunction with recognition and record circuit
14 (which may include a processor including a comparator) and
memory 15, compares the user's response to one of a plurality of
templates encoded into memory 15 of telephone system 10. The
flexibility of telephone system 10 allows for receiving at step
18 either spoken words from a user or, in some instances,
corresponding DTMF push button codes from telephone 11
representing spoken command words. Throughout the drawings an
asterisk indicates that telephone system 10 can recognize either
spoken command words or corresponding DTMF push button codes
representing the command words. Asterisks also indicate that
telephone system 10 performs speaker independent speech
recognition in matching a model to a user's response. For
illustrative purposes only, the description of the preferred
embodiment shall proceed as though the telephone system receives
spoken responses instead of representative commands through
corresponding DTMF push button codes.

Telephone system 10 may recognize one of various command phrases
and proceed according to the recognized command. Telephone
system 10 may recognize a telephone number at step 19 received
from a user as a command. Telephone system 10 informs the user
at step 20 of the number received and the sequence may continue
to step 22 where the telephone number will be automatically
dialed in order to place the requested call. Telephone system 10
may also recognize an emergency command at step 24, such as
"help", as a second command phrase received from the user.
Telephone system 10 notifies the user at step 26 that the
emergency telephone number, such as 911, is being dialed and the
sequence proceeds to step 22 where once again the call will be
placed. The telephone system may also recognize a third command
phrase, CANCEL, from the user which automatically returns the
telephone system to step 17, ceasing any command sequence
currently in progress. For example, as shown in FIGURE 2, the
user may halt the placement of a telephone call prior to a
connection being made at the other end of the telephone line.
Though shown at only one point in FIGURE 2, the recognition of a
CANCEL command at step 28 may occur anywhere within the
telephone system method described in reference to subsequent
figures.

Telephone system 10 may recognize a fourth command word at step
30 when the user requests to enter the user directory list. When
telephone system 10 recognizes this command, the process flows
to step 32 to allow the user to enter the directory option.
FIGURE 3 illustrates a flow diagram of the process steps in the
directory option portion of the telephone system program. Upon
command recognition, the user enters the directory option at
step 34. To ensure that only authorized users may enter the
directory option, telephone system 10 implements security
measures at steps 36 and 38, requiring the user to provide a
verification of the authority to enter the directory option. The
verification may be an authorization code that the user inputs
into the system or there may be speaker dependent speech
recognition templates to match the user's speech patterns to
verification templates stored within telephone system 10.

At step 36, telephone system 10 prompts the user for the proper
verification and recognizes the user's verification response at
step 38. Telephone system 10 may perform steps 36 and 38 more
than one time as part of the verification process. If the
telephone system does not recognize the verification code given
by the user, process flow returns to step 17 of FIGURE 2 in a
similar manner as a CANCEL command. If telephone system 10
recognizes a valid verification code, process flow continues to
step 40 where telephone system 10 prompts the user to input one
of four commands for the directory option. Also see Kero, U. S.
Patent No. 5,369,685 for user verification.

Once the user has provided the appropriate verification and
enters the directory option, telephone system 10 may recognize a
first subcommand word at step 42 to add a user directory name to
the system. Telephone system 10 enrolls the user at step 44 by
requesting a directory name and saving the user's response in a
template at step 46 to be stored within the telephone system. In
enrolling a user directory name, telephone system 10 may repeat
steps 44 and 46 in order to create the template and save it with
the existing list of user identification templates already
registered for that account or telephone. Once a template is
saved, process flow returns to step 40 where telephone system 10
prompts the user for another command word.

Telephone system 10 may recognize a second subcommand word at
step 48 to delete a user directory name. When recognized,
telephone system 10 prompts the user at step 50 for the name of
the user directory to delete. Telephone system 10 recognizes the
directory name given by the user at step 52 and requests the
user to confirm the deletion of the directory name at step 54.
If the user does not confirm deletion of the directory name,
process flow returns to step 40 where telephone system 10
prompts the user for a command phrase. If the user does confirm
deletion of the directory name at step 54, telephone system 10
deletes the template at step 56 created for that directory name
and any telephone list entries corresponding to that directory
name. Once deleted, process flow returns to step 40 where
telephone system 10 prompts the user for a new command phrase.

Telephone system 10 may recognize a third subcommand phrase at
step 58 to review the list of directory names. When recognized,
telephone system 10 plays the user directory list at step 60
before returning to step 40 to request a new command phrase.
Telephone system 10 may also recognize a fourth command phrase
at step 62, determining that the user has completed the
directory option request. When recognized, process flow returns
to step 17 of FIGURE 2 where telephone system 10 prompts the
user for a telephone number.

Returning to FIGURE 2, telephone system 10 may recognize a
directory name at step 64 as a fifth command phrase. When
telephone system 10 recognizes a user directory name, process
flow proceeds to step 66 where the telephone system enters a
speed dial list option. FIGURE 4 is a flow diagram of the speed
dial list option process of the present invention. Telephone
system 10 enters the speed dial list option at step 68 and
subsequently prompts the user at step 70 to either request a
name to call or enter the list. When telephone system 10
recognizes an entry name at step 72, a prompt is given to the
user at step 74, indicating the requested name to be called by
telephone system 10. Telephone system 10 then places the call at
step 22 in FIGURE 2 as previously described. Telephone system 10
may also recognize and enter a list command at step 76 and
prompt the user for one of five list command phrases at step 78.

Telephone system 10 may recognize a first list command phrase at
step 80 to add an entry name and phone number to the speed dial
list under the user's directory name. When recognized, telephone
system 10 prompts the user at step 82 to enroll the entry name
into the speed dial list. Telephone system 10 prompts the user
at step 83 to enroll a phone number corresponding to the entry
name just enrolled at step 82. Telephone system 10 creates and
saves a template corresponding to the name and phone number
enrolled by the user at step 84. Telephone system 10 may repeat
steps 82, 83, and 84 in order to verify and create a valid
template of the entry name and phone number for the speed dial
list. Once saved, process flow returns to the beginning of the
speed dial list option routine at step 70.

Telephone system 10 may recognize a second list command at step
86 to modify a phone number corresponding to an entry name. When
recognized, telephone system 10 prompts the user at step 88 to
provide the name whose phone number is to be modified. Telephone
system 10 recognizes the name given by the user at step 90 and
allows the user to modify the telephone number corresponding to
that name at step 92. Telephone system 10 saves a template of
the modified number corresponding to the entry name for which
modification was requested at step 94. Telephone system 10 may
repeat steps 92 and 94 to ensure valid creation of the telephone
number template. Once the template is saved, process flow
returns to step 70 as previously described.

Telephone system 10 may recognize a third list command at step
96 to delete a name from the speed dial list. When recognized,
telephone system 10 prompts the user at step 98 for the name to
be deleted from the speed dial list. Telephone system 10
recognizes the name at step 100 and requests the user to confirm
deletion of the name at step 102. If the user does not wish to
delete the entry name, process flow returns to step 70. If the
user does confirm deletion of the entry name, the telephone
system deletes the entry name template and corresponding phone
number template at step 104 before routing the process flow back
to step 70.



Telephone system 10 may recognize a fourth list command phrase
at step 106 to allow a user to review his speed dial list. When
recognized, telephone system 10 plays the user's speed dial list
at step 108 and returns process flow back to step 70.

Telephone system 10 may also recognize a fifth list command
phrase at step 110, indicating that the user has completed the
speed dial list option. When recognized, process flow returns to
step 17 of Fig. 2 and the method repeats as previously
described.

In summary, a telephone system may generate a separate directory
for each authorized user of the telephone system. Each user may
create a speed dial list containing names and phone numbers
under the user's own directory. By using speaker dependent
features, no one else can gain access to an authorized user's
directory or speed dial list. The above is by way of background
to enrollment of speed dial names using voice recognition.
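
The per-user organization summarized above can be pictured as a small data structure. The following is only an illustrative sketch: the class and field names are hypothetical, templates are treated as opaque lists, and the limit of fifteen entries is taken from the example list size mentioned later in the description.

```python
from dataclasses import dataclass, field


@dataclass
class SpeedDialEntry:
    # Speaker dependent template of the spoken entry name (opaque here).
    name_template: list
    phone_number: str


@dataclass
class UserDirectory:
    # Speaker dependent template of the spoken directory name.
    directory_template: list
    # Hypothetical label -> entry mapping standing in for the speed dial list.
    entries: dict = field(default_factory=dict)
    max_entries: int = 15  # example size only

    def is_full(self) -> bool:
        return len(self.entries) >= self.max_entries

    def add_entry(self, label: str, template: list, phone_number: str) -> None:
        if self.is_full():
            raise ValueError("speed dial list is full")
        self.entries[label] = SpeedDialEntry(template, phone_number)

    def delete_entry(self, label: str) -> None:
        # Removes both the name template and its phone number.
        del self.entries[label]
```

Keeping one `UserDirectory` per enrolled directory name mirrors the separation between users that the speaker dependent templates are meant to enforce.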

The processor 12 in Fig. 1, according to one embodiment of the
present invention, is programmed to operate according to the
flow chart of Fig. 5 to enroll speed dial names into a speed
dial list. The processor 12 includes ELPC and ULPC counters and
allows a subscriber to create a base phrase and then update it.
The subscriber is allowed three chances to say the spoken name
to get it into a list in a manner to best recognize the spoken
name. The system also addresses the problem of the subscriber
adding a name to the list that is either already on the list or
very similar to a name on the list. It also addresses the
problem of the subscriber saying the name too differently as it
is enrolled and updated.

Recent developments in the use of garbage models to determine
out-of-vocabulary speech have given rise to a new recognition
process that provides an out-of-vocabulary recognition
capability while preserving a high rate of in-vocabulary
recognition. This new recognition process utilizes a penalized
garbage model in parallel with spoken speed dialing names to
discriminate out-of-vocabulary speech. This approach is applied
to spoken speed dialing enrollment recognition to address the
problem of enrollment of names already on a speed dial list and
too much variability during enrollment. A "garbage model" is
defined as a model for any speech, which may be words or sounds,
for which no other model exists within the recognition system.
There are several possibilities for means of constructing
garbage models. A single garbage model commonly used in
state-of-the-art recognition, shown in Fig. 6, models a
collection of broad phonetic classes of speech sounds which are
linked to form sounds making up a word. As shown in Fig. 6 the
circles represent the acoustic broad phonetic classes. The solid
lines indicate transitions that may be made in either direction
from one broad phonetic class to another. The dotted lines
indicate that the model may loop on a particular state.
Transitions are weighted by probabilities based on temporal
phonotactic constraints. These constraints require that the
longer a given phonetic class is used to explain speech, the
less likely the class will be used to explain subsequent speech,
and the more likely subsequent speech will be explained by other
different phonetic classes. The model may begin explaining
speech by entering or leaving at any state.

During similar name checking, recognition is performed with the
new name being added to the list. The new name can either match
an existing name on the list, or match the parallel garbage
model. If the name matches an existing name, then the user is
informed that the name or a similar name is already on the list,
and that the name will not be added. If the new name matches the
parallel garbage model, then it is assumed that the name is not
on the list and the addition of the name continues. The penalty
on the garbage model can be adjusted to affect the sensitivity
to matching either a name on the list or to the garbage model.
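
The similar-name decision can be illustrated with a short sketch. It assumes, purely for illustration, that the recognizer returns log-likelihood style scores (higher is better) for each existing name and for the garbage model, and that the penalty is simply subtracted from the garbage score; the text does not state how the penalty is applied, and the function and parameter names are hypothetical.

```python
def check_similar_name(name_scores, garbage_score, garbage_penalty):
    """Return the existing name the new utterance matches, or None.

    name_scores: dict of existing name -> recognition score (assumed
        log-likelihood style, higher is better).
    garbage_score: score of the parallel garbage model.
    garbage_penalty: amount subtracted from the garbage score; a larger
        penalty makes a match to a listed name more likely.
    """
    if not name_scores:
        return None  # empty list: nothing to be similar to
    penalized_garbage = garbage_score - garbage_penalty
    best_name = max(name_scores, key=name_scores.get)
    if name_scores[best_name] > penalized_garbage:
        return best_name  # too similar: the name is rejected
    return None  # garbage model wins: the addition continues


scores = {"roger rabbit": -120.0, "bugs bunny": -200.0}
strict = check_similar_name(scores, garbage_score=-110.0, garbage_penalty=20.0)
lenient = check_similar_name(scores, garbage_score=-110.0, garbage_penalty=5.0)
```

With the larger penalty the utterance is treated as matching "roger rabbit"; with the smaller one the garbage model wins and the name would be accepted as new, which is the sensitivity adjustment the paragraph describes.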

The subscriber, when trying to enroll a new name to the speed
dial list, enters a menu entitled "Voice Dial List Management"
(position 40 in Fig. 3 or 78 in Fig. 4) and enters or says "Add
Entry". When this command is recognized the system first checks
at step 201 whether or not the list is full. If it is full, the
system notifies the user that it is full. This can be done by a
synthesized voice message from memory 15 and synthesizer 15a
that states, "Your list is full. You must delete a name before
adding a new one." The user may return to the List Management
menu and delete a name on the list. See steps 96-104 in Fig. 4.
If the list is not full, or after deleting a name on the list
and returning to "Add Entry", the ELPC and ULPC counters
(LPCCNT) are set to zero (step 202). The system will then keep
count of the Enrollment LPC (Linear Predictive Coding), or ELPC,
and the Update Linear Predictive Coding (ULPC) counts. The LPC
is a speech sample represented by linear prediction parameters.
LPC is assumed to be linear. For more on LPC, for example, see
pages 81-124, "Linear Predictive Coding of Speech" by Bishnu S.
Atal (Chapter 4), in "Computer Speech Processing", edited by
Frank Fallside and William Woods, Prentice Hall (ISBN
0-13-163841-6). If the counts of ELPC and ULPC are both zero
(step 203), indicating that nothing has been entered before, a
tutorial, synthesized prompt statement is played (decision "yes"
at step 203). The synthesized statement may say:

"The system needs to learn how you say the name. 
There will be a long pause after you say the name the 
first time, and then the system will ask you to repeat the
name between one and four times. In the future, you can
skip this message by dialing pound. After the beep,
please say the voice calling name." (A beep sounds at
the end for the user to say the name.) The user says the
name. A check is made at step 205 to determine if the name is
already on the list of fifteen (for example) names, is close to
a name on the list, or matches the garbage model. If there is a
match to a name on the list (indicating a similar name already
in the list), the system enters the Add Entry Retry of step 206.
If out of retries is "yes" (step 207), the system goes back to
the menu of voice dial list management (position 40 in Fig. 3 or
78 in Fig. 4). If not out of tries ("no" at step 207), the
system plays via the synthesizer 15a a "too similar" message
("... is too similar to another name on your list. Please choose
a different name"), increments the ELPC counter (step 202 shown
in Fig. 5) and uses the shorter prompt ("After the beep, please
say the voice dialing name") to try again. If the user does not
give a new name in time, the system times out, increments the
counter and requests a new name. If a key on the keyboard is
pressed that is not valid, and/or after a time out, the system
increments the counter at step 202 and asks for a spoken name.
If the system fails to enroll after three tries or a fifth
invalid DTMF key, the system disconnects (step 209). If the
spoken name is not matched, the system goes to the "Add Entry
Enroll" of Fig. 7.
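
A simplified sketch of the Add Entry attempt loop described above follows. It is not the flow chart of Fig. 5 itself: DTMF key handling and the disconnect path are omitted, a time-out is modeled as `None`, and the function name, return labels and the `too_similar` callback are hypothetical.

```python
def add_entry_attempts(existing, attempts, too_similar,
                       max_names=15, max_tries=3):
    """Walk through spoken attempts until one is acceptable.

    existing: names already on the speed dial list.
    attempts: iterable of utterances; None stands for a time-out.
    too_similar: callback(utterance, existing) -> bool (assumed to wrap
        the recognizer's similar-name check).
    Returns (outcome, utterance) where outcome is one of
    "list_full", "enroll" or "out_of_tries".
    """
    if len(existing) >= max_names:
        return ("list_full", None)  # step 201: list is full
    elpc = 0  # enrollment counter, reset as in step 202
    for utterance in attempts:
        if utterance is not None and not too_similar(utterance, existing):
            return ("enroll", utterance)  # continue with Add Entry Enroll
        elpc += 1  # too similar or timed out: count the failed try
        if elpc >= max_tries:
            break
    return ("out_of_tries", None)
```

For example, with `existing = ["alice"]` and attempts `["alice", None, "bob"]`, the first two tries are counted as failures and the third is passed on for enrollment.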

After a successful saying of a name that isn't matched, at Add
Entry Enroll in Fig. 5, the system follows the flow chart of
Fig. 7. The utterance is stored when the on-line enrollment
starts. The saved utterance is used to create a template (step
301) by performing an off-line enrollment. If the length of the
utterance is not too short, that is, greater than or equal to a
minimum threshold such as, for example, ten frames of data ("no"
at step 302), the system proceeds to step 305 to add entry
update and follows the flow chart of Fig. 8. If the message is
too short, or less than the minimum threshold (less than ten
(10) frames of data for the example) ("yes" at step 302), the
system asks via the synthesizer if the user wants to use the
template even if it hasn't been used before. For a "yes" at step
302 (less than the minimum threshold), the prompt message may
state:

"The name [name given] is shorter than the recom- 
mended name length. It is best to use both first and last 
names. To use this name anyway say OKAY. To cancel 
adding this name, say CANCEL." 

This is followed by a beep prompt. If "OKAY" is received at
response step 307, the system proceeds to Add Entry Update of
Fig. 8. If "CANCEL" is received, a synthesized statement such
as "Name not added" is generated and provided, and the system
proceeds back to the Voice List Management Menu (position 40 in
Fig. 3 or 78 in Fig. 4). If nothing is said (time out), or an
unrecognized command is given or an incorrect key is pressed,
the system provides synthesized instructions and goes back to
looking for a response. If after five times there is not a
recognized response, or after three time outs, the system is
disconnected with a message (step 309). If a DTMF key is pressed
the synthesizer provides the message "Incorrect Key". After each
time out, each wrong key and after the third and fourth
unrecognized voice command the synthesizer may state, "Say OK or
Cancel" or, for more detailed instructions, "Say okay to
continue adding this name. Say cancel to cancel adding this
name." If "OKAY" is recognized, even for a short name, the
system proceeds to the Add Entry Update of Fig. 8.
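
The length check and the OKAY/CANCEL confirmation can be sketched as a single decision function. This is an illustrative simplification: retry counting, time-out limits and DTMF handling are left out, and the function name and return labels are hypothetical. The ten-frame threshold is the example value given in the text.

```python
MIN_FRAMES = 10  # example minimum threshold from the description


def short_name_decision(num_frames, user_response=None, min_frames=MIN_FRAMES):
    """Decide what happens after the enrollment template is created.

    num_frames: length of the enrollment utterance in frames.
    user_response: "OKAY", "CANCEL", or anything else (including None
        for a time-out) once the short-name prompt has been played.
    """
    if num_frames >= min_frames:
        return "update"      # not too short: go on to Add Entry Update
    if user_response == "OKAY":
        return "update"      # user accepts the short name anyway
    if user_response == "CANCEL":
        return "cancelled"   # "Name not added", back to the menu
    return "reprompt"        # time-out or unrecognized response
```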
In determining recognition, the system uses the garbage model
with the penalties listed below.



start(_garbage_pssd)
_garbage_pssd, 0.6 --> _rhot, s1_rhot.
_garbage_pssd, 0.6 --> _backv, s1_backv.
_garbage_pssd, 0.6 --> _frontv, s1_frontv.
_garbage_pssd, 0.6 --> _fric, s1_fric.
_garbage_pssd, 0.6 --> _nasal, s1_nasal.
_garbage_pssd, 0.6 --> _stop, s1_stop.
_garbage_pssd, 0.6 --> _sib, s1_sib.
_garbage_pssd, 0.6 --> _lowv, s1_lowv.
s1_rhot, 6e-06 --> "".
s2_rhot, 0.06 --> "".
s3_rhot, 0.6 --> "".
s1_backv, 6e-06 --> "".
s2_backv, 0.06 --> "".
s3_backv, 0.18 --> "".
s4_backv, 0.3 --> "".
s5_backv, 0.6 --> "".
s1_frontv, 6e-06 --> "".
s2_frontv, 0.06 --> "".
s3_frontv, 0.18 --> "".
s4_frontv, 0.3 --> "".
s5_frontv, 0.6 --> "".
s1_fric, 6e-06 --> "".
s2_fric, 0.06 --> "".
s3_fric, 0.18 --> "".
s4_fric, 0.3 --> "".
s5_fric, 0.6 --> "".
s1_nasal, 6e-06 --> "".
s2_nasal, 0.06 --> "".
s3_nasal, 0.6 --> "".
s1_stop, 6e-06 --> "".
s2_stop, 0.06 --> "".
s3_stop, 0.6 --> "".
s1_sib, 6e-06 --> "".
s2_sib, 0.06 --> "".
s3_sib, 0.18 --> "".
s4_sib, 0.3 --> "".
s5_sib, 0.6 --> "".
s1_lowv, 6e-06 --> "".
s2_lowv, 0.06 --> "".
s3_lowv, 0.18 --> "".
s4_lowv, 0.3 --> "".
s5_lowv, 0.6 --> "".
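
One way to hold the listing above in a program is a plain table keyed by broad phonetic class. The sketch below only transcribes the numbers; the variable and function names are hypothetical, and reading each number as a weight attached to the numbered state of a class is an assumption, since the text does not define the notation.

```python
# Weight on each transition from the garbage start symbol into a class.
ENTRY_WEIGHT = 0.6

# Per-state values transcribed from the listing (s1, s2, ... per class).
STATE_WEIGHTS = {
    "rhot":   [6e-06, 0.06, 0.6],
    "backv":  [6e-06, 0.06, 0.18, 0.3, 0.6],
    "frontv": [6e-06, 0.06, 0.18, 0.3, 0.6],
    "fric":   [6e-06, 0.06, 0.18, 0.3, 0.6],
    "nasal":  [6e-06, 0.06, 0.6],
    "stop":   [6e-06, 0.06, 0.6],
    "sib":    [6e-06, 0.06, 0.18, 0.3, 0.6],
    "lowv":   [6e-06, 0.06, 0.18, 0.3, 0.6],
}


def state_weights(phonetic_class):
    """Return {state name: value} for one class, e.g. 's1_nasal'."""
    values = STATE_WEIGHTS[phonetic_class]
    return {f"s{i}_{phonetic_class}": w for i, w in enumerate(values, start=1)}


nasal = state_weights("nasal")
```

In every class the values rise from 6e-06 at the first state to 0.6 at the last, so the table can be checked mechanically after transcription.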



If the received template has a high score for any of 
the listed garbage models it receives a high score for 
unrecognizable speech and is rejected as unrecognized 
speech. 

Referring to Fig. 8, the template is downloaded (step 401) and
the system determines if there has been an update. During
update, the garbage model is used to explain speech that is not
in the enrollment template. For instance, if, during enrollment
the user said "uh, Roger Rabbit", then the garbage model
explains the "uh" (a gasp), and only the "Roger Rabbit" portion
of the update utterance is averaged into the new name template.
If the update counter (ULPC) is zero (step 402), this means no
update has been done and the system requests the user via the
synthesizer to "Please say the name again." The update counter
is incremented (step 403) and when there is a response an update
of the template (step 404) is made using that response. The
template is checked to determine if a good update occurred. If a
good update did occur the user is asked to enter the phone
number for that name (step 405). This may be keyed in or spoken
in using voice recognition with speaker-independent recognition
models. If the update fails the system proceeds to the Add Entry
Retry steps of Fig. 9. If the update fails the enroll and update
utterances are swapped and the enrollment and update are
attempted in that order. Often a user is not ready for speaking
the first time and so an insertion such as "uh" (a gasp) might
likely occur before the name is spoken, but when asked to say it
again the user is prepared to speak. The first template has the
gasp of "uh" in it and when an update is done the update may
fail because there is no "uh". When the utterances are swapped
the cleaner second utterance is used for enrollment and the
update is done with the first utterance, so the "uh" gasp at the
beginning of the utterance is explained by the garbage model and
the "uh" is not included in the template. If this swapping of
the first and second utterance fails, a third utterance is
requested via the out of tries check (step 406), and the
response and the second utterance are used for the update. If a
third utterance is requested for enrollment, then that name is
checked first to see if it is too similar to another name on the
list; if so it is not used, and processing proceeds to input A
in Fig. 5. If the enrollment fails because the utterance was too
short, the system will notify the subscriber and re-prompt for
another utterance. If the enrollment succeeds, but the utterance
(frame length) is too short (is less than the minimum length
threshold), then the subscriber will be given a warning that
poor recognition may result because the enrollment name is too
short. The subscriber is prompted to say "OKAY" or "CANCEL".

In summary, if an update fails, then the utterances are swapped,
to see if the second utterance (or third if required) makes a
better enrollment utterance than the first. The following order
of enrollment and updates is attempted, but only a maximum of
three utterances are requested from the user.
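
The swap strategy can be sketched as follows. The complete order of attempts is not reproduced in this text, so the sketch covers only the three attempts that the description and Claim 4 state explicitly: first utterance enrolled and second used for update, then the two swapped, then the second enrolled and a third used for update. The function and callback names are hypothetical, and `try_update` stands in for the recognizer's enroll-then-update step.

```python
def enroll_with_swaps(first, second, get_third, try_update):
    """Try enrollment/update pairs in order, requesting a third
    utterance only if the first two orderings fail.

    try_update(enroll_utt, update_utt) -> template or None on failure.
    get_third() -> third utterance, or None if none is available.
    """
    template = try_update(first, second)   # normal order
    if template is not None:
        return template
    template = try_update(second, first)   # swapped order
    if template is not None:
        return template
    third = get_third()                    # at most three utterances
    if third is None:
        return None
    return try_update(second, third)       # second enrolled, third updates


# Toy recognizer: enrollment fails when the enroll utterance begins
# with a hesitation, mimicking the "uh, Roger Rabbit" example.
def toy_update(enroll_utt, update_utt):
    if enroll_utt.startswith("uh"):
        return None
    return (enroll_utt, update_utt)


result = enroll_with_swaps("uh roger rabbit", "roger rabbit",
                           lambda: None, toy_update)
```

In the toy run the first ordering fails because the enrollment utterance carries the "uh", and the swapped ordering succeeds with the cleaner second utterance as the enrollment template.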
Although the present invention and its advantages
have been described in detail, it should be understood 
that various changes, substitutions and alterations can 
be made herein without departing from the spirit and 
scope of the invention. 



Claims 

1. A method of enrolling speech recognition models in a speech
recognition database comprising:

providing a penalized garbage model to explain extraneous
speech;

receiving a new speech recognition utterance for enrollment from
a user;

generating a template of said received utterance for enrollment;

requesting the user to repeat the utterance again to be
enrolled; receiving a second received utterance; comparing the
second utterance to the generated template and the penalized
garbage models to determine if a match; and

adding said new template to a speed dial list if a match as to
in-vocabulary speech.

2. The method of Claim 1, wherein the comparison step comprises
the step of comparing said second utterance to said penalized
garbage model for rejecting in said second utterance any part
thereof that matches within a predetermined degree said
penalized garbage model.

3. The method of Claim 1 or Claim 2, further comprising the step
of swapping the template and said second received utterance if
the comparison fails to match and repeating the comparing step.

4. The method of Claim 3, comprising the step of requesting and
receiving a third utterance if after the swapping step the
comparison fails and the third response and the second utterance
are compared and, if a match, entering a template of the second
utterance to the database.

5. The method of Claim 3 or Claim 4, further comprising the step
of requesting successive utterances, if after swapping fails to
get a compare on previous utterances, and the successive
response with a previous utterance are compared and if a match
entering a template of successive utterance to the database.

6. The method of Claims 1 to 5, wherein the step of receiving
speech recognition utterance comprises receiving a speed dial
name.

7. The method of Claim 6, further comprising the step of
requesting and adding a telephone number to be associated with
said template.

8. The method of Claim 6 or Claim 7, further comprising the step
of requesting and receiving successive new speed dial name
utterances, and if after swapping fails to get a compare on
previous utterances, and the successive responses and a previous
utterance are compared and if a match entering a template of a
successive utterance to the database.

9. A method of enrolling received new speech recognition
utterances in a database comprising the steps of:

providing speaker dependent templates of received
utterances;

providing a penalized garbage model to explain
extraneous speech; and

determining if the utterance to be enrolled
matches either a previously provided speaker
dependent template or said penalized garbage
model and, if it matches a previously provided
speaker dependent template, then rejecting the
enrollment.
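
As a non-limiting illustration of the determining step of Claim 9, the Python sketch below scores a candidate utterance against every enrolled template and against the penalized garbage model. The labels, the `cost` callable and the `garbage_cost` value are assumptions introduced for the example. The enrollment would be rejected when an existing template wins and allowed when the garbage model wins, since the latter indicates that the utterance resembles nothing already enrolled.

```python
from typing import Callable, Mapping, Optional, Sequence

# Hypothetical feature representation of one utterance or template.
Features = Sequence[float]


def conflicting_entry(
    utterance: Features,
    enrolled: Mapping[str, Features],
    cost: Callable[[Features, Features], float],
    garbage_cost: float,
) -> Optional[str]:
    """Return the label of an enrolled template that matches, else None.

    A template only counts as a match if it scores better (lower cost)
    than the penalized garbage model. None therefore means the garbage
    model gave the best explanation and the new name may be enrolled.
    """
    best_label: Optional[str] = None
    best_cost = garbage_cost
    for label, template in enrolled.items():
        c = cost(utterance, template)
        if c < best_cost:
            best_label, best_cost = label, c
    return best_label
```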

10. The method of Claim 9, further comprising receiving
speech recognition utterances for name addresses.

11. The method of Claim 9, further comprising receiving 
speech recognition utterances for speed dial names 
and associated telephone numbers. 

12. The method of Claim 10 or Claim 11 comprising the
step of: 

determining if the utterance to be enrolled is 
less than a minimum length threshold. 

13. The method of Claim 12, wherein, if the utterance is
less than said minimum length threshold, the user's
approval is determined before adding the template
of the utterance to the database.
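
The length test of Claims 12 and 13 might look like the following Python sketch. Measuring length in frames and the names `min_frames` and `user_approves` are assumptions of this example; the claims only require a minimum length threshold and the user's approval for shorter utterances.

```python
from typing import Callable


def length_check(
    num_frames: int,
    min_frames: int,
    user_approves: Callable[[], bool],
) -> bool:
    """Return True if enrollment may continue.

    Utterances at or above the threshold pass without a prompt. Shorter
    utterances pass only if the user explicitly approves the short name.
    """
    if num_frames >= min_frames:
        return True
    return user_approves()
```

The user is prompted only when the utterance falls below the threshold, so names of ordinary length are enrolled without an extra confirmation.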

14. A method of enrolling addresses in a speech
recognition database comprising the steps of:

providing speaker dependent templates of
addresses;

providing a penalized garbage model for
unrecognized speech;

receiving the address to be enrolled; and

determining if the address to be enrolled is too
short.

15. A method of enrolling speed dial names in a
telephone system comprising:

providing speaker dependent templates of
speed dial names and associated telephone
numbers;

providing a penalized garbage model for
unrecognized speech;

receiving a new speed dial name utterance for
enrollment from a user;

generating a template of said received speed
dial name utterance for enrollment;

determining if the name to be enrolled is too
similar to a provided speaker dependent template
and, if too similar, rejecting the enrollment
or, if matching the garbage model, then allowing
the enrollment;

determining if the utterance to be enrolled is
less than a minimum length threshold and, if the
utterance is less than said minimum length
threshold, determining the user's approval
before adding the template of the utterance to
the speed dial list;

requesting the user to repeat the new speed dial
name utterance to be enrolled;

receiving a second received new speed dial
name utterance;

comparing the second new speed dial name
utterance to the generated template and the
penalized garbage model to determine if there is
a match; and

adding said new speed dial name template to
a speed dial list if there is a match.

16. The method of Claim 15 including the step of re- 
questing and adding a telephone number to be as- 
sociated with said new speed dial name template. 

17. The method of Claim 15 or Claim 16, including the 
step of swapping the template and said second re- 
ceived speed dial name utterance if the comparison 
fails to match and repeating the comparing step. 

18. The method of any of Claims 15 to 17, including the
step of requesting and receiving a third new speed
dial name utterance if the comparison fails after
the swapping step, comparing the third response
with the second utterance and, if there is a match,
entering the second utterance in the speed dial list.
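
The order of the steps in Claims 15 and 16 can be summarized in a short Python sketch. Every name below (`SpeedDialEntry`, `enroll_speed_dial_name` and the callables passed to it) is hypothetical, and template generation is reduced to a placeholder; the sketch is intended only to show the sequence of checks, not any particular recognizer.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence

# Hypothetical feature representation of one utterance or template.
Features = Sequence[float]


@dataclass
class SpeedDialEntry:
    template: Features
    telephone_number: str


def enroll_speed_dial_name(
    first: Features,
    speed_dial_list: List[SpeedDialEntry],
    too_similar: Callable[[Features, Features], bool],
    too_short: Callable[[Features], bool],
    user_approves_short_name: Callable[[], bool],
    request_repeat: Callable[[], Features],
    confirms: Callable[[Features, Features], bool],
    request_number: Callable[[], str],
) -> Optional[SpeedDialEntry]:
    """Run the checks in the order recited; return the new entry or None."""
    template = first  # placeholder for template generation
    # Reject names that are too similar to an existing template.
    if any(too_similar(template, e.template) for e in speed_dial_list):
        return None
    # Short names need the user's approval before going any further.
    if too_short(first) and not user_approves_short_name():
        return None
    # Only now is the user asked to repeat the name.
    second = request_repeat()
    if not confirms(second, template):
        return None
    # Ask for the telephone number to associate with the new template.
    entry = SpeedDialEntry(template, request_number())
    speed_dial_list.append(entry)
    return entry
```

Passing the prompts as callables keeps the user interaction lazy: the repetition and the telephone number are requested only after the earlier checks have succeeded.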



19. The method of Claim 15 or Claim 16, wherein the
comparison step includes the step of comparing said
second utterance to said penalized garbage model
for rejecting in said second utterance any utterance
that matches within a predetermined degree said
penalized garbage model.

20. A telephone apparatus for enrolling a speech
recognition utterance comprising:

a memory storing speaker dependent templates of
the utterances;

a memory storing a penalized garbage model
for unrecognized speech; and

a comparator for comparing the utterance to be
enrolled to said stored speaker dependent template
or said penalized garbage model for rejecting
the enrollment if too similar.

21. The telephone apparatus of Claim 20 including
means for determining if the utterance to be
enrolled is less than a minimum length threshold
and, if the utterance is less than said minimum
length threshold, determining the user's approval
before adding the template of the utterance to a
speed dial list.

22. The telephone apparatus of Claim 20 or Claim 21,
wherein the speech recognition utterance is a
speed dial name and associated telephone numbers.

23. A telephone apparatus for enrolling speed dial
names comprising:

a storage device storing a penalized garbage
model for unrecognized speech;

a receiver for receiving a new speed dial name
utterance for enrollment from a user;

a generator coupled to said receiver for
generating a template of said received speed dial
name utterance for enrollment;

means for requesting the user to repeat the new
speed dial name utterance to be enrolled;

said receiver in response to receiving said
second received new speed dial name utterance
comparing the second new speed dial name
utterance to the generated template and the
penalized garbage model to determine if there is
a match; and

means for adding said new speed dial name
template to a speed dial list if there is a match.

24. The telephone apparatus of Claim 23 including
means for requesting and adding a telephone
number to be associated with said new speed dial
name template.



25. The telephone apparatus of Claim 23 or Claim 24,
wherein said receiver includes means for swapping 
the template and said second received speed dial 
name utterance if the comparison fails to match and 
for again comparing. 

26. The telephone apparatus of Claim 25 including
means for requesting and receiving a third new
speed dial name utterance if the comparison fails
after the swapping step, for comparing the third
response with the second utterance and, if there is
a match, for entering the second utterance in the
speed dial list.

27. The telephone apparatus of any of Claims 23 to 26,
wherein said receiver includes means for comparing
said second utterance to said penalized garbage
model for rejecting in said second utterance
any utterance that matches within a predetermined
degree said penalized garbage model.

28. An apparatus for enrolling speech templates in a
speech recognition database comprising:

a storage device storing a penalized garbage
model for unrecognized speech;

a receiver for receiving a new speech address
utterance for enrollment in said database from
a user;

a generator coupled to said receiver for
generating a template of said received speech
address utterance for enrollment;

means for requesting the user to repeat the new
speech address utterance to be enrolled;

said receiver in response to receiving a
second received new speech address utterance
comparing the second utterance to the
generated template and the penalized garbage
model to determine if there is a match; and

means for adding said new template to said
database if there is a match.

29. The telephone apparatus of Claim 28 wherein said
receiver includes means for swapping the template
and said second received utterance if the comparison
fails to match and for again comparing.

30. The telephone apparatus of Claim 29 including
means for requesting and receiving a third utterance
if the comparison fails after the swapping step, for
comparing the third response with the second
utterance and, if there is a match, for entering the
second utterance in the database.

31. The telephone apparatus of any of Claims 28 to 30,
wherein said receiver includes means for comparing
said second utterance to said penalized garbage
model for rejecting in said second utterance
any utterance that matches within a predetermined
degree said penalized garbage model.